Discovering Consensus Patterns in Biological Databases

Authors

  • Mohamed Y. Eltabakh
  • Walid G. Aref
  • Mourad Ouzzani
  • Mohamed H. Ali
Abstract

Consensus patterns, like motifs and tandem repeats, are highly conserved patterns with very few substitutions where no gaps are allowed. In this paper, we present a progressive hierarchical clustering technique for discovering consensus patterns in biological databases over a certain length range. This technique can discover consensus patterns with various requirements by applying a post-processing phase. The progressive nature of the hierarchical clustering algorithm makes it scalable and efficient. Experiments to discover motifs and tandem repeats on real biological databases show significant performance gain over non-progressive clustering techniques.
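
To make the definition above concrete, the following is a minimal sketch of what "very few substitutions where no gaps are allowed" can mean for a set of equal-length occurrences. It is not the progressive hierarchical clustering technique described in the paper; the function names (hamming, consensus, is_consensus_pattern), the column-wise majority rule, and the max_subs threshold are all assumptions made for illustration.

```python
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Count substitutions between two strings; no gaps, so lengths must match."""
    if len(a) != len(b):
        raise ValueError("gap-free comparison needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def consensus(occurrences: list[str]) -> str:
    """Column-wise majority string (an assumed, simple way to pick a consensus)."""
    if len({len(o) for o in occurrences}) != 1:
        raise ValueError("all occurrences must have the same length")
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*occurrences))


def is_consensus_pattern(occurrences: list[str], max_subs: int) -> bool:
    """True if every occurrence is within max_subs substitutions of the consensus."""
    center = consensus(occurrences)
    return all(hamming(center, o) <= max_subs for o in occurrences)


# Hypothetical occurrences of a length-6 pattern, each at most one substitution apart
occurrences = ["ACGTAC", "ACGTTC", "ACGAAC"]
center = consensus(occurrences)
within_one = is_consensus_pattern(occurrences, max_subs=1)
```

Under these assumptions, a discovery method would group candidate substrings of each length in the requested range and keep the groups that pass a check like is_consensus_pattern.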

Similar resources

An Introduction to Policy Delphi; A tool for discovering the opposing views on health policy issues

Objective: In this review, we investigated various aspects of Policy Delphi technique to make decision-makers more aware of this pertinent method so that they can use it in their policy decisions in their organizations. Information sources and selected methods for study: This study was conducted using a review method and by searching the related literature in databases such as PubMed, Scopus a...

Incremental Mining for Frequent Patterns in Evolving Time Series Databases

Several emerging applications warrant mining and discovering hidden frequent patterns in time series databases, e.g., sensor networks, environment monitoring, and inventory stock monitoring. Time series databases are characterized by two features: (1) The continuous arrival of data and (2) the time dimension. These features raise new challenges for data mining such as the need for online proces...

Discovering Sequence Motifs of Different Patterns Parallely using DNA Operations

Discovery of motifs in biological sequences and various types of subsequences in commercial databases have varied applications and interpretations. This paper proposes a new approach to

Mining association rules from biological databases

area such as bioinformatics. This methodology allows the identification of relationships between low-magnitude similarity (LMS) sequence patterns and other well-contrasted protein characteristics, such as those described by database annotations. We start with the identification of these signals inside protein sequences by exhaustive database searching and automatic pattern recognition strategie...

Automated Discovery of Protein Motifs With Genetic Programming

Automated methods of machine learning may prove to be useful in discovering biologically meaningful information hidden in the rapidly growing databases of DNA sequences and protein sequences. Genetic programming is an extension of the genetic algorithm in which a population of computer programs is bred, over a series of generations, in order to solve a problem. Genetic programming is capable of...


Publication date: 2006